1.  INTRODUCTION

Problems  in  list-processing  have  generally  centered  around  the 
allocation of nodes of fixed length. A more interesting but relatively
less studied problem involves the allocation of space of variable length
in  the  main  memory  of  a  computer.   In  situations  where  the  total  available 
memory  space  is  not  a  critical  quantity,  this  problem  could  easily,  though 
inefficiently,  be  disposed  of  by  using  a  multiple  number  of  nodes,  each 
of  fixed  length.   This  enables  us  to  utilize  the  generality  offered  by 
the  standard  list-processing  techniques.   The  need  for  more  fast  memory 
in  computer  systems,  however,  urges  us  to  conserve  as  much  space  in  core 
as  possible.   Hence  we  arrive  at  the  concept  of  "dynamic"  allocation  where 
each  "node"  is  created  according  to  the  specific  requirements  of  the 
allocation  request. 

It  seems  that  the  best  way  of  studying  the  behavior  of  any  system 
under  various  core  allocation  algorithms  would  be  by  simulation  on  a 
digital computer. The characteristics and complexities of systems, and the
structures imposed on them, can best be understood with the set of tools provided
by this type of simulation (Fishman [3]). Rigorous mathematical analysis
in the area of dynamic storage allocation has been scarce (Knuth [4]),
and will not be attempted in this thesis either. Further, simulation
offers  a  technique  whereby  most  of  the  system  characteristics  can  be  studied 
with a little extra effort on the part of the programmer, but with no
inconvenience to other users. The system response can also be conveniently
monitored by introducing suitable statements in chosen parts of the
simulation  program.   It  is  the  intention  of  this  thesis  also  to  emphasize 
the  need  for  adopting  good  simulation  techniques  for  storage  allocation 
algorithms  and  to  suggest  methods  of  achieving  the  same. 

The  motivation  for  this  work  came  from  the  need  to  find  better 
allocation schemes for the PASCAL system on the PDP-11 at the University
of  Illinois  (Stocks  [11]).   The  system  is  now  being  used  both  in  the 
educational  and  in  the  research  environments.   Any  dynamic  allocation 
policy  must  suit  the  requests  generated  from  large  production  programs  such 
as the PASCAL compiler. (It has been observed that a general user program
rarely generates a set of requests that is as critical as, or more critical
than, that produced by the compiler.)

The  most  important  limiting  factor  in  the  study  is  the  total 
available  core  space.   It  is  common  these  days  to  get  around  this  constraint 
by  introducing  the  concepts  of  virtual  memory,  paging  and  segmentation 
(Madnick,  etc.  [7],  Brinch  Hansen  [1],  for  example).   In  view  of  the 
complexity  that  such  techniques  would  introduce  into  the  run-time  package, 
the  authors  of  the  compiler  decided  to  break  down  the  compiler  into 
reasonably-sized  passes,  each  of  which  communicates  with  the  next  by  means 
of  files  on  secondary  storage  and  small  tables  in  core  (Krishnaswamy  [6]). 
Thus the results from this study are most useful for a system in which the
overlay structure is decided by the programmer himself. It is strongly
felt that most of the results will also be useful for most minicomputer-based
systems.


The project endeavored initially to simulate the PASCAL system
and determine the performance of the existing first-fit algorithm in terms
of measurable quantities. Various other known algorithms were then
attempted and, based on the results, new algorithms were devised. These
algorithms, which form a hitherto unexplored though exciting class of
algorithms, will be presented in Chapters 4 and 5. Attempts are made to
specify the areas in which a particular algorithm would be useful.


2.      PROBLEM  FORMULATION 

2.1  Overview

External  fragmentation  is  the  phenomenon  by  which  the  unused, 
free  area  of  the  main  memory  is  not  available  as  one  contiguous  block,  but 
forms  a  checkerboard  of  "holes"  in  the  storage.   Fragmentation  occurs,  and 
is  a  problem,  in  computers  where  limitations  imposed  by  hardware  or  by 
the  simplicity  of  the  linking  loader  prevent  the  dynamic  relocation  of 
jobs  or  job  steps.   The  situation  resulting  from  fragmentation  can  be 
depicted  by  an  example. 

Given  below  is  a  set  of  requests  to  be  applied  to  a  memory  of 
finite  size  L. 

1.  Allocate 'A' of size a.
2.  Allocate 'B' of size b.
3.  Allocate 'C' of size c.
4.  Deallocate 'B'.
5.  Allocate 'D' of size d, where
      i)  a + b + c < L
     ii)  b < d < L - (a+c)
    iii)  L - (a+b+c) < d

Figures 2.1 through 2.5 show the effect of the above set of
requests.


Figure  2.1  Initial  Memory  Configuration 
Figure 2.2  After Request 1
Figure 2.3  After Request 2
Figure 2.4  After Request 3
Figure 2.5  After Request 4


It  is  clear  that  in  spite  of  the  fact  that  enough  space  exists  in 
memory  to  accommodate  'D',  we  are  prevented  from  doing  so  because  the  space 
is  not  available  as  a  contiguous  piece. 
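
The five requests can be replayed in a few lines of Python. This is only a minimal sketch, not the PASCAL simulator used in this work: the values L = 100, a = 30, b = 20, c = 30 and d = 25 are assumptions chosen to satisfy conditions (i) to (iii), and the helper names `first_fit` and `release` are hypothetical.

```python
def first_fit(free, size):
    """Carve `size` units out of the first free block that is large enough.

    `free` is a list of (address, length) pairs kept in address order.
    Returns the allocated address, or None if no single block fits.
    """
    for k, (addr, length) in enumerate(free):
        if length >= size:
            if length == size:
                del free[k]                       # block used up entirely
            else:
                free[k] = (addr + size, length - size)
            return addr
    return None


def release(free, addr, size):
    """Return a block to the free list and merge adjacent free blocks."""
    free.append((addr, size))
    free.sort()
    merged = [free[0]]
    for a, n in free[1:]:
        last_a, last_n = merged[-1]
        if last_a + last_n == a:                  # touches the previous hole
            merged[-1] = (last_a, last_n + n)
        else:
            merged.append((a, n))
    free[:] = merged


# Assumed sizes: a+b+c < L, b < d < L-(a+c), L-(a+b+c) < d.
L, a, b, c, d = 100, 30, 20, 30, 25
free = [(0, L)]
addr_a = first_fit(free, a)       # request 1
addr_b = first_fit(free, b)       # request 2
addr_c = first_fit(free, c)       # request 3
release(free, addr_b, b)          # request 4
total_free = sum(n for _, n in free)
addr_d = first_fit(free, d)       # request 5

print(free, total_free, addr_d)
```

With these values the free list ends as two holes of 20 units each, so 40 units are free in total, yet the request for 25 units cannot be met.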

Extending  this  concept  to  a  large  memory  and  to  a  large  number 
of  requests  for  allocation  and  deallocation,  a  similar  situation  may  often 
be  reached.  All  the  used  spaces  in  core  may  be  so  distributed  in  memory 
that  not  even  one  of  the  intervening  free  spaces  can  accommodate  the  next 
request.  And  yet,  the  total  amount  of  noncontiguous  free  space  may  be 
many  times  the  requested  size. 

The  degree  of  fragmentation  is  dependent  on 
i)  the  size  of  the  memory 
ii)  the  size,  distribution  and  lifetime  of  the  requests,  and 
iii)  the  allocation  policy  adopted. 
Thus,  for  the  example  shown  in  Figures  2.1  through  2.5,  we  could  have 
satisfied  the  request  5  by 

i)  increasing L to L' > a+b+c+d
ii)  breaking D into two parts of sizes d' < b and
d" < L - (a+b+c), where d = d' + d"
iii)  allocating 'C' at (L-c) instead of at (a+b).

While  physical  limitations  prevent  constant  use  of  solution  (i), 
unpredictability  prevents  the  "trial-and-error"  approach  of  solution  (iii). 

Solution (ii) may sometimes be possible if a link is maintained between
the two parts of the whole. This turns out to be inconvenient when 'D' is
a  load  module  written  in  position-dependent  code.   The  possibility  of 
memory  compaction  is  ruled  out  because  of  system  limitations  to  dynamic 
relocation,  as  mentioned  earlier. 

Another suggestion (Clifton [2]) claims that it may be possible
to  postpone  allocation  of  a  critical  request  until  some  deallocations  reduce 
the  checkerboarding  to  the  required  extent.   This  solution  tends  to  be 
impractical  in  a  monoprogramming  environment  with  a  number  of  serial  job 
steps,  where  allocation  implies  the  immediate  need  of  the  space  for 
computational  or  I/O  buffering  purposes. 

2.2  Notation 

The following quantities are defined here for notational
convenience.

The status of a memory M after the k-th request has been satisfied
can be defined by

$$M_k = [L, N_f, N_u, F, U],$$

where

$L$ = the total size of the memory in some convenient units (like bytes for the PDP-11)
$N_f$ = the number of free blocks
$N_u$ = the number of used blocks (two contiguous used blocks will not count as 1 block)
$F$ = configuration of free spaces
$U$ = configuration of used spaces.

$F$ is an $N_f$-tuple and $U$ is an $N_u$-tuple:

$$F = \{F_1, F_2, \ldots, F_{N_f}\}, \qquad U = \{U_1, U_2, \ldots, U_{N_u}\},$$

where $F_i = (A_i, \ell_i)$ and $U_j = (A_j, \ell_j)$, $A$ representing the low address of
the block and $\ell$ representing its length.

If $F_i = (A_i, \ell_i)$, then $\ell(F_i)$ will indicate the length vector of $F_i$
and $A(F_i)$ will indicate the low address vector of $F_i$. $\ell(U_j)$ and $A(U_j)$
have similar significance.

The  situations  in  Figures  2.1  through  2.5  can  then  be  represented 
symbolically  as : 

$M_0 = [L, 1, 0, \{(0, L)\}, \{\lambda\}]$, $\lambda$ denoting empty
$M_1 = [L, 1, 1, \{(a, L-a)\}, \{(0, a)\}]$
$M_2 = [L, 1, 2, \{(a+b, L-a-b)\}, \{(0, a), (a, b)\}]$
$M_3 = [L, 1, 3, \{(a+b+c, L-a-b-c)\}, \{(0, a), (a, b), (a+b, c)\}]$
$M_4 = [L, 2, 2, \{(a, b), (a+b+c, L-a-b-c)\}, \{(0, a), (a+b, c)\}]$
For any system, the following relations are always true:
i)  $N_u(i) - N_f(i) \geq -1$
ii)  $|N_u(i+1) - N_u(i)| = 1$
iii)  $|N_f(i+1) - N_f(i)| < 2$
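
The notation maps directly onto a small data structure. The sketch below is an illustration only: the class name `MemoryState`, the function `relations_hold` and the numeric sizes (L = 100, a = 30, b = 20, c = 30) are assumptions, not part of the thesis. It encodes the configurations of Figures 2.1 through 2.5 and checks relations (i) to (iii) on them.

```python
from typing import List, NamedTuple, Tuple

Block = Tuple[int, int]          # (low address, length)


class MemoryState(NamedTuple):
    """One memory configuration; the block counts are derived from F and U."""
    L: int
    free: List[Block]            # F, the free blocks
    used: List[Block]            # U, the used blocks

    @property
    def n_free(self) -> int:
        return len(self.free)

    @property
    def n_used(self) -> int:
        return len(self.used)


def relations_hold(states) -> bool:
    """Check relations (i) to (iii) over a sequence of configurations."""
    for m in states:
        if m.n_used - m.n_free < -1:                  # relation (i)
            return False
    for prev, cur in zip(states, states[1:]):
        if abs(cur.n_used - prev.n_used) != 1:        # relation (ii)
            return False
        if abs(cur.n_free - prev.n_free) >= 2:        # relation (iii)
            return False
    return True


# Assumed sizes for the example of Figures 2.1 through 2.5.
L, a, b, c = 100, 30, 20, 30
states = [
    MemoryState(L, [(0, L)], []),
    MemoryState(L, [(a, L - a)], [(0, a)]),
    MemoryState(L, [(a + b, L - a - b)], [(0, a), (a, b)]),
    MemoryState(L, [(a + b + c, L - a - b - c)], [(0, a), (a, b), (a + b, c)]),
    MemoryState(L, [(a, b), (a + b + c, L - a - b - c)], [(0, a), (a + b, c)]),
]
ok = relations_hold(states)
print([(m.n_free, m.n_used) for m in states], ok)
```

The initial configuration, with one free block and no used block, is the case in which relation (i) holds with equality.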

2.3   Figures  of  Merit 

The following performance criteria were selected for investigation
in each of the algorithms simulated.

1.  Average  time  per  allocation  (t).   This  is  important  because  the  cost 
of  using  a  certain  algorithm  is  proportional  to  the  time  taken  for 
satisfying  the  requests,  which  should  be  minimized.   In  certain 
cases,  in  the  absence  of  any  limit  on  the  time,  blocks  may  be  moved 
physically  in  core  to  achieve  compaction. 

The technique utilized to instrument t involved the average
number of comparisons (between the required size and the size of
a free block under observation) made by the system before finding
a suitable block of core for an allocation request.

2.  The number of requests ($I$) satisfied before fragmentation shuts
out an allocation request. A particular simulation run is stopped
as soon as this situation arises. If $N_f(I)$ indicates the value
of $N_f$ for the memory configuration $M_I$, then $I$ is the smallest positive
integer such that the request size $R_{I+1}$ satisfies the relation

$$\sum_{j=1}^{N_f(I)} \ell(F_j) \;\geq\; R_{I+1} \;>\; \ell(F_i), \qquad i = 1, 2, \ldots, N_f(I).$$

Evidently, the nature of the distribution of the request sizes will
be one of the factors influencing the value of $I$.

3.  The least size ($S$) of the biggest area of free core, given by

$$S = \min_{\text{all } i} \Bigl( \max_{1 \leq j \leq N_f(i)} \ell(F_j(i)) \Bigr),$$

where $N_f(i)$ and $F_j(i)$ stand for $N_f$ and $F_j$ for the memory configuration
$M_i$. This quantity is a measure of core sufficiency. It gives an
idea of the amount by which the request sizes may be increased
before the system runs out of core.

4.  The total amount of free core ($C_T$) when $\max_{1 \leq j \leq N_f} \ell(F_j) = S$. It is
hence the total free core available when the biggest area of free
core is the minimum. $C_T$ hence helps to determine whether the
free spaces were distributed all over the memory or whether most
of it was concentrated at a single spot. It is safe to assume
that the degree of fragmentation is less if $S/C_T$ is closer to unity.

5.  The mean size ($\bar{W}$) of the biggest free block:

$$\bar{W} = \frac{1}{n} \sum_{i=1}^{n} \max_{1 \leq j \leq N_f(i)} \ell(F_j(i)),$$

where $n$ is the total number of requests and $N_f(i)$ and $F_j(i)$ are
$N_f$ and $F_j$ for the memory configuration $M_i$. For a given set of
requests, while small differences in this parameter cannot
determine the superiority of one algorithm over another, a
large difference would definitely indicate the inferiority of
the algorithm with the smaller mean.

6.  The critically tolerant size ($T$). This can be described as the
smallest size of initial free memory, $L(M_0)$, for which all requests
in a particular set can be satisfied. The value of $T$, which is
dependent on the request set, gives an indication of the amount
of memory that plays an active role in satisfying that request set
completely.

7.  The perturbability (P). Certain allocation request sizes may change
in the course of time. For instance, a compiler may need to be
expanded for the inclusion of a new feature, or the data bases of
an airline reservation system may need expansion to provide for
the inclusion of additional flights. In such cases, it is important
that the system does not run out of core merely because of a small
increase in the request sizes.

This situation may be simulated using a uniform random number
generator to generate the percentage by which any given allocation
request size should be increased. System dependent request sizes,
like file buffer size, which remain constant from run to run, should
remain unperturbed. The recognition of such sizes is a difficult
task, and must be made prior to the simulation runs. Unusually
large concentrations of requests will be noticed at these sizes if
a system is monitored.

The  maximum  percentage  by  which  the  sizes  are  perturbed  shall  be 
termed  the  degree  of  perturbation.   It  is  clear  that  as  the  degree 
of  perturbation  is  increased,  a  situation  will  be  reached  when  the 
system  begins  to  run  out  of  core.   The  least  degree  of  perturbation 
at  which  this  occurs  shall  be  termed  the  perturbability  of  the  algorithm 
for  the  particular  set  of  requests. 

In  some  pathological  cases,  reducing  the  size  of  a  request  may 
cause an algorithm to fail. In the example of Figures 2.1 through
2.5, if b > d, then the request 5 could be fulfilled and the area 'D'
could be allocated. On reducing the request size b to below that of
d, the fragmentation causes the system to run out of core. However,
such  cases  occur  rarely,  and  in  most  practical  situations  with  large 
numbers  of  allocations  and  deallocations,  an  increase  in  the  size  of 
a  request  or  a  number  of  requests  worsens  the  situation. 
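
The perturbation experiment can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually coded for the thesis: each unprotected size is increased by an independent uniform percentage between zero and the degree of perturbation, Python's standard `random` module stands in for the generator used in this work, the protected sizes default to 512 and 32 (the peaks reported in Chapter 3), and `fails` is a hypothetical callback that stands for a complete simulation run.

```python
import random


def perturb(sizes, degree, protected=(512, 32), rng=None):
    """Increase each size by a uniform random percentage in [0, degree].

    Sizes listed in `protected` (system dependent sizes) are left untouched.
    """
    rng = rng or random.Random(0)
    out = []
    for s in sizes:
        if s in protected:
            out.append(s)
        else:
            pct = rng.uniform(0.0, degree)
            out.append(s + int(s * pct / 100.0))
    return out


def perturbability(sizes, fails, max_degree=100, protected=(512, 32), seed=0):
    """Least integer degree (in percent) at which `fails(perturbed)` is true.

    `fails` is a stand-in for a simulation run that reports whether the
    allocator ran out of core. Returns None if no degree up to `max_degree`
    makes it fail.
    """
    for degree in range(1, max_degree + 1):
        perturbed = perturb(sizes, degree, protected, random.Random(seed))
        if fails(perturbed):
            return degree
    return None


sizes = [100, 512, 32, 2000]                    # hypothetical request sizes
example = perturb(sizes, 10, rng=random.Random(1))
print(example)
```

Re-seeding the generator for every degree keeps the runs comparable, so that only the degree of perturbation changes from one run to the next; this is a design choice of the sketch.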


The random number generator was obtained by the technique of combining
random number generators to obtain a "more random" sequence. Chi-square
statistics from run tests indicated that a 96% randomness can be easily
obtained. The outline of the design of the generator, attributed to
Marsaglia and MacLaren [8], was obtained from Knuth [5].
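
The idea of combining two generators can be sketched as below. The thesis does not list its generator, so everything concrete here is an assumption: the two linear congruential generators, their constants, the table size of 64 and the class names are illustrative only. One generator fills a table, and the other selects which table entry is output next.

```python
class LCG:
    """Linear congruential generator x <- (a*x + c) mod m (illustrative)."""

    def __init__(self, a, c, m, seed):
        self.a, self.c, self.m = a, c, m
        self.state = seed % m

    def next(self):
        self.state = (self.a * self.state + self.c) % self.m
        return self.state


class ShuffledGenerator:
    """Output the values of generator `x` in an order chosen by generator `y`."""

    def __init__(self, x, y, k=64):
        self.x, self.y, self.k = x, y, k
        self.table = [x.next() for _ in range(k)]   # fill the auxiliary table

    def next(self):
        xv = self.x.next()
        yv = self.y.next()
        j = (self.k * yv) // self.y.m               # table index in 0..k-1
        out = self.table[j]
        self.table[j] = xv                          # refill the used slot
        return out

    def uniform(self):
        """A value in [0, 1)."""
        return self.next() / self.x.m


def make_generator(seed_x=1, seed_y=2, k=64):
    # Illustrative constants; any two reasonable generators could be used.
    x = LCG(1664525, 1013904223, 2 ** 32, seed_x)
    y = LCG(22695477, 1, 2 ** 32, seed_y)
    return ShuffledGenerator(x, y, k)


gen = make_generator()
sample = [gen.uniform() for _ in range(5)]
print(sample)
```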

The above figures of merit have been found to be quite useful
for  comparing  algorithms.   While  each  figure  changes  with  the  set  of  requests 
used,  a  good  idea  of  the  relative  merit  of  various  algorithms  may  be  obtained 
by  comparing  the  values  obtained  for  a  well-chosen  fixed  set  of  requests. 
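
Three of the figures above (the least size of the biggest free block, the total free core at that point, and the mean size of the biggest free block) can be computed from a record of the free-block lengths after each request. The following is a minimal sketch; the function name, the input layout and the sample history are hypothetical, and when the minimum is reached more than once the sketch takes the first occurrence, which is an assumption.

```python
def merit_figures(free_history):
    """Compute (S, C, W) from the free-block lengths after each request.

    free_history[i] is the list of free-block lengths in the configuration
    reached after request i+1.
    """
    biggest = [max(lengths) if lengths else 0 for lengths in free_history]
    S = min(biggest)                      # least size of the biggest free block
    at = biggest.index(S)                 # first configuration where it occurs
    C = sum(free_history[at])             # total free core at that point
    W = sum(biggest) / len(biggest)       # mean size of the biggest free block
    return S, C, W


# Hypothetical history of four configurations.
history = [[70], [50], [30, 10], [20, 20, 5]]
S, C, W = merit_figures(history)
print(S, C, W, S / C)
```

For this history the biggest free block shrinks to 20 units while 45 units are free in total, so the ratio of the two is well below unity, indicating noticeable fragmentation.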

3.  SYSTEM SIMULATION AND MONITORING

This  chapter  will  outline  certain  simulation  techniques  which 
have  been  found  to  be  suitable  for  the  study  of  storage  allocation 
algorithms.   Techniques  convenient  for  monitoring  the  system  behavior 
will  also  be  described. 

3.1  Simulation  Model 

For  the  purposes  of  this  simulation,  the  system  is  simply  that 
portion  of  the  PASCAL  run-time  package  which  receives  commands  for 
allocation  and  deallocation  of  space  in  core,  along  with  the  memory  space 
and  its  contents.  The  relationship  that  exists  between  the  system  and 
its  environment  can  be  represented  as  shown  in  Figure  3.1. 

The  blocks  within  the  dotted  lines  represent  the  system  to  be 
simulated  while  the  rest  of  the  blocks  indicate  the  environment.  A  simpler 
picture  would  be  as  shown  in  Figure  3.2. 

This  evidently  is  a  simplistic  model,  which  becomes  more  intricate 
on  the  introduction  of  a  feedback  loop,  whereby  the  system  environment  can 
take  remedial  action  in  case  of  inability  of  the  system  to  fulfill  a 
request.   This  is  typically  the  case  in  a  demand  paging  system,  where 
the  remedial  action  involves  the  swapping  out  of  certain  chosen  pages  from 
core.   The  current  study  is  limited  to  situations  where  no  distinction  is 
made between the name space of a job step and the physical space associated
with  it. 


[Figure 3.1: block diagram with the blocks: program being executed; core
allocation/deallocation request; routine to search for convenient free block;
routine to search for the block to be deallocated; routine to load at proper
point from secondary storage (optional); routine to unload information back
to secondary storage (optional); error routine.]

3.2  Simulation Data

It  is  generally  the  practice  of  researchers  to  try  their  algorithms 
out using certain standard request distributions like uniform, Erlang,
exponential, hyperexponential or combinations thereof. In these cases,
the pseudo-random number generator must be made with great care, particularly
in  simulation  programs  working  under  stringent  space  restrictions. 

However,  standard  distributions  do  not  represent,  in  most  cases, 
the  actual  distribution  of  request  sizes  or  of  the  lifetime  of  the  requests. 
(The  lifetime  of  a  request  is  defined  as  the  time  interval  between  the 
allocation  of  a  request  and  the  subsequent  deallocation  of  the  corresponding 
block.) Also, an algorithm suitable for a system under certain conditions
may  not  be  suitable  for  a  different  system  or  under  different  conditions. 

It  may  hence  prove  to  be  useful  to  study  the  actual  request  set 
generated  by  the  system.   Thereafter,  the  set  of  requests  used  as  data  for 
the  simulation  study  may  be  chosen  from  among  the  "typical"  sets,  as  the 
author  did,  or  may  be  designed  as  a  pseudo-random  set  conforming  to  the 
distributions  of  the  system-generated  set.   The  deterministic  set  could 
then  be  perturbed  to  note  the  effect  on  the  system  of  changes  in  the  sizes 
of  procedures,  arrays,  etc.  More  details  may  be  obtained  from  the 
paper  (Nair  [9]). 

3.3  System  Monitoring 

The  PASCAL  run-time  package  is  capable  of  generating  a  system 
log  which  monitors  all  requests  for  allocation  (GET)  and  deallocation  (FREE). 
A  section  of  a  sample  log  is  shown  below: 

The following useful information can be extracted from the log:
i)  the histogram indicating the frequency of allocation
requests of different sizes
ii)  the distribution of the lifetime of requests
iii)  the maximum number of active core spaces of a
particular size that co-exist in memory. This
is useful in the implementation of boundary algorithms
to be described in Chapter 5. It must be noted here
that the initial configuration of the system is
explicitly one in which there is no free core and
that the system frees all the available core at the
outset. It is, therefore, possible to determine most
of the system characteristics in a time-independent
manner and with no initial assumptions.
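
The three items can be extracted in a single pass over the log. The sketch below is hedged in two ways: the record layout, ("GET", block identifier, size) and ("FREE", block identifier), is an assumed format rather than the actual format of the PASCAL run-time log, and the lifetime is measured as the number of log entries between a GET and the matching FREE, which is one possible time-independent measure.

```python
from collections import Counter


def analyse_log(log):
    """Return (size histogram, list of (size, lifetime), peak count per size)."""
    size_hist = Counter()        # item (i): frequency of each request size
    lifetimes = []               # item (ii): one (size, lifetime) per block
    live = {}                    # block id -> (size, log index of its GET)
    live_by_size = Counter()     # blocks of each size currently allocated
    peak_by_size = Counter()     # item (iii): maximum coexisting per size

    for index, record in enumerate(log):
        if record[0] == "GET":
            _, block_id, size = record
            size_hist[size] += 1
            live[block_id] = (size, index)
            live_by_size[size] += 1
            peak_by_size[size] = max(peak_by_size[size], live_by_size[size])
        else:                    # "FREE"
            _, block_id = record
            size, born = live.pop(block_id)
            lifetimes.append((size, index - born))
            live_by_size[size] -= 1
    return size_hist, lifetimes, peak_by_size


# Hypothetical log fragment.
log = [
    ("GET", 1, 512), ("GET", 2, 32), ("GET", 3, 512), ("FREE", 2),
    ("FREE", 1), ("GET", 4, 32), ("FREE", 3), ("FREE", 4),
]
size_hist, lifetimes, peak_by_size = analyse_log(log)
print(size_hist, lifetimes, peak_by_size)
```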

It is probably useful to mention here that over 80% of all jobs
in the research and educational environments on the PDP-11 here are
compilation jobs. It has also been noticed, as mentioned earlier, that these
PASCAL compilation jobs are among the most critical where core requirements
are concerned. Figures 3.3 and 3.4 show the frequency histogram and the
lifetime distribution respectively for a typical set of requests produced
by the compiler. Figures 3.5 and 3.6 are the corresponding graphs for
a noncompiler job.

It  is  interesting  to  examine  the  characteristics  of  the 
distributions : 

1.  The  sizes  512  and  32  merit  special  attention.   In  the  particular 
PASCAL  system  under  study,  the  file  buffer  size  is  1000  octal  bytes 
or  512  decimal  bytes.   It  is  clear  that  input,  output  and 
intermediate  file  buffers  will  be  a  part  of  every  program  and 
hence  it  is  not  surprising  that  there  is  a  concentration  of  requests 
at the 512 bytes mark. Similarly, the association of a header of
32 bytes with every file buffer explains the unusual concentration
in the region of 32 bytes. These figures (512 and 32) are
characteristic  of  the  implementing  system;  yet  it  is  reasonable 
to  expect  such  a  set  of  figures  with  every  system. 

2.  Other than the requests of the type described above, the rest of
the charts in Figures 3.3 and 3.5 show a large number of requests of
small size (< 150 bytes), some requests of very large size
(> 10,000 bytes) and only a few requests of intermediate size.
The actual numbers clearly depend on the program generating the
requests. We may conclude intuitively that the request sizes which
cause peaking in the frequency histogram, along with the very large
sized requests, will probably govern the performance of various
algorithms.

Figure 3.3a

3.  The lifetime distribution indicates a function decaying in
approximately an exponential manner. The rate of decay appears
definitely greater than unity. To determine the behavior of the
individual sizes, the lifetime distributions for sizes 512 and 32
were determined; they are shown in Figures 3.7 and 3.8 respectively
for the typical compilation job. A similar behavior is noticed
again.

The physical implication is that most allocated spaces in core
are short-lived and this is particularly true for the smaller sized requests.
The smaller sizes correspond to the array allocations and dynamically
allocated PASCAL "records," while the large requests correspond to large
program segments like PASCAL external procedures. It seems reasonable again
to infer that the lifetime distributions for other PASCAL-like systems will
be similar.

3.4  Stopping Rules and Validation

One common notion is that a simulation run should continue until
the allocation algorithm being tested is unable to satisfy a request
(e.g., Knuth [4] and Shore [10]). This technique hence presupposes the
existence of a perennial source of data, like that obtained from some
pseudo-random generator. One drawback to this technique is that the sequence
of requests generated may not be practical. For instance, when a random
number generator is used, there always exists a probability, however small,
that a large sized request, say 25K, will appear in a 40K memory even though
another similar sized block has not yet been deallocated. (The
overlay structure may prevent this from happening in the actual system.)

It  is  not  impossible  to  take  all  such  special  factors  into 
account  while  designing  the  pseudo-random  number  generator — it  is  merely 
inconvenient.  A  more  pragmatic  approach  would  involve  tailoring  a  typical 
set  of  requests  actually  provided  by  the  system  so  as  to  achieve  a  certain 
degree  of  generality.   This  is  also  suitable  particularly  when  the 
performance  of  an  algorithm  is  required  to  be  known  only  in  comparison 
to  another,  rather  than  on  an  absolute  scale.   The  various  steps  in  the 
author's  simulation,  which  utilized  the  principles  just  mentioned,  can  be 
summarized  as  follows : 

1.  A  random  program  was  compiled  and  the  set  of  requests 
obtained  was  arbitrarily  termed  the  "standard"  set  S. 

2.  This  set  was  studied  to  note  the  existence  of  peaks 
in  the  frequency  histogram  like  at  sizes  512  and  32. 

3.  The values for the various performance criteria of
Chapter 2 were determined using the set S and the
existing first-fit algorithm. (Sizes showing peaks
were not perturbed while performing perturbability
tests.)

4.  Step 3 was repeated for other algorithms.

5.  Results obtained were then compared with those for
the existing algorithm.

This technique has two interesting advantages:
i)  the use of the system generated request log
automatically ensures the validity of the
simulation model, and
ii)  the check with the existing system in step 3 above
ensures the validity of the simulation program.
(Validation of the algorithm coding itself is,
however, a strenuous task and involves checking by
manual simulation.)

4.  STANDARD ALGORITHMS AND VARIATIONS

This  chapter  describes  some  of  the  commonly  used  algorithms  and 
suggests  modifications  to  improve  their  performance. 

4.1  Best-Fit Algorithm

Let

$$M_k = [L, N_f, N_u, F, U],$$

$$F = \{F_1, F_2, \ldots, F_{N_f}\}, \quad \text{and}$$

$$U = \{U_1, U_2, \ldots, U_{N_u}\},$$

where the symbols have the significance indicated in Chapter 2.

Let $a$ be the size of core requested. In the best-fit algorithm
an $F_x$ is to be found, $1 \leq x \leq N_f$, such that

i)  $\ell(F_x) \geq a$, and
ii)  $\ell(F_x) \leq \ell(F_i)$ $\forall\, i$ such that $\ell(F_i) \geq a$.

A search for the required block would usually take us through all
elements of the set $F$. (In some cases an exact match with the required
size may be found, in which case the search may be terminated.)

A  PASCAL  procedure  for  simulating  this  would  be: 

    CONST
        SOMELARGENUMBER = 32767;        "FOR PDP-11"

    TYPE
        NODE = RECORD
            ADDRESS: INTEGER;           "LOW ADDRESS OF BLOCK"
            LENGTH:  INTEGER;           "LENGTH OF BLOCK"
            INFO:    INTEGER;           "IDENTIFYING INFORMATION"
            LPTR:    ^NODE              "LINKED LIST POINTER"
        END;

    VAR
        ACTIVE, FREE: ^NODE;            "LIST HEADS FOR CIRCULAR LISTS"

    PROCEDURE FINDBESTBLOCK(A: INTEGER; VAR P: ^NODE);
    "THIS PROCEDURE RETURNS A POINTER P TO THE SMALLEST FREE BLOCK WITH"
    "LENGTH GREATER THAN OR EQUAL TO A"
    VAR
        PTR, FOLLOWPTR: ^NODE;          "FOLLOWPTR ALWAYS FOLLOWS PTR"
        MINDIFF:        INTEGER;        "MIN. DIFFERENCE IN SIZE FROM A"
        DIFF:           INTEGER;        "SIZE OF BLOCK POINTED AT BY PTR, MINUS A"
    BEGIN
        MINDIFF := SOMELARGENUMBER;
        P := NIL;
        PTR := (FOLLOWPTR := FREE)^.LPTR;
        WHILE (PTR \= FREE) & (MINDIFF \= 0) DO BEGIN
            DIFF := PTR^.LENGTH - A;
            IF (DIFF >= 0) & (DIFF <= MINDIFF) THEN BEGIN
                MINDIFF := DIFF;
                P := PTR;
            END;    "IF DIFF>=0 ..."
            PTR := (FOLLOWPTR := PTR)^.LPTR;
        END;        "WHILE ..."
    END;            "FINDBESTBLOCK"

A reduction in the search time could be achieved if the chain
is linked according to increasing length of the free blocks, $\ell(F_i)$, instead
of in address order (increasing $A(F_i)$ values). The average number of
comparisons would then be approximately $N_f/2$ instead of $N_f$, where $N_f$ is
the number of elements in the free chain. However, another $N_f/2$ comparisons
will have to be made, on an average, to

i)  return a used block to the free list, and to
ii)  reposition a free block formed out of fragmentation
by an allocation request.

Use of a modified structure may lead to an average of the order
of $\log N_f$ for the number of comparisons, but this is only at the expense
of inefficiency in coalescing of adjacent inactive blocks. When $N_f$ is not
very large, and this is often the case, such a structure may not prove
to be worthwhile. Ease of coalescing is a great advantage and hence the
ordering by $A(F_i)$ is preferred over the ordering by $\ell(F_i)$.

4.2  First-Fit Algorithm

Given memory configuration $M_k$ and request size $a$ as before,
this algorithm finds $F_x$, $1 \leq x \leq N_f$, such that

i)  $\ell(F_x) \geq a$, and
ii)  $\ell(F_j) < a$ $\forall\, j < x$.

Thus the search stops as soon as a block at least as large as the requested
size is found. No attempt is made to minimize the difference between
$\ell(F_x)$ and $a$ as in the best-fit algorithm.

The PASCAL implementation for this algorithm would be as shown
below. (The global declarations are not repeated here.)

    PROCEDURE FINDFIRSTBLOCK(A: INTEGER; VAR P: ^NODE);
    "THIS PROCEDURE FINDS THE FIRST FREE BLOCK OF SIZE>=A"
    VAR
        PTR, FOLLOWPTR: ^NODE;
    BEGIN
        P := NIL;
        PTR := (FOLLOWPTR := FREE)^.LPTR;
        WHILE (PTR \= FREE) & (PTR^.LENGTH < A) DO
            PTR := (FOLLOWPTR := PTR)^.LPTR;
        IF PTR \= FREE THEN P := PTR;
    END;    "FINDFIRSTBLOCK"

4.3  Comparison of the Best-Fit and First-Fit Algorithms

It may be noticed that the best-fit and the first-fit algorithms
would perform in approximately the same manner if the ordering of the
nodes in the free list were by $\ell(F_i)$ instead of by $A(F_i)$. In the usual
case where the ordering is by $A(F_i)$, it is obvious that the first-fit
algorithm will take less time, on an average, to find a free block of
suitable size.
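
The difference in search effort can be illustrated by counting comparisons, the quantity used to instrument the average time per allocation. The Python sketch below is an illustration, not the simulator used for this work: the free list is a plain list of (address, length) pairs rather than the circular linked list of the PASCAL procedures, and the sample free list is hypothetical. Ties in the best-fit search go to the later block, as the test DIFF<=MINDIFF in FINDBESTBLOCK implies.

```python
def find_first_block(free, a):
    """Return (index of first block with length >= a or None, comparisons)."""
    comparisons = 0
    for k, (_, length) in enumerate(free):
        comparisons += 1
        if length >= a:
            return k, comparisons
    return None, comparisons


def find_best_block(free, a):
    """Return (index of smallest block with length >= a or None, comparisons).

    The scan stops early only when an exact fit is found.
    """
    best, min_diff, comparisons = None, None, 0
    for k, (_, length) in enumerate(free):
        comparisons += 1
        diff = length - a
        if diff >= 0 and (min_diff is None or diff <= min_diff):
            best, min_diff = k, diff
            if diff == 0:
                break
    return best, comparisons


# Hypothetical free list in address order.
free = [(0, 40), (100, 12), (200, 25), (300, 10)]
print(find_first_block(free, 10), find_best_block(free, 10))

# With the list ordered by length, first-fit selects the best-fit block.
by_length = sorted(free, key=lambda blk: blk[1])
k_first, _ = find_first_block(by_length, 12)
k_best, _ = find_best_block(free, 12)
print(by_length[k_first], free[k_best])
```

For a request of 10 units on the address-ordered list, first-fit stops after one comparison while best-fit needs four; on the length-ordered list the first-fit search returns the same block that best-fit selects.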

Both the algorithms were simulated. The results obtained from
the first-fit algorithm, which is the one already existing on the system,
are depicted graphically in Figures 4.1 through 4.5. All the graphs are
plotted against the degree of perturbation, a term which has been defined
in Chapter 2.

Figure 4.1 indicates that the least size of the biggest free core
decreases with the degree of perturbation, but somewhat unsteadily. The
dip at 6% and the rise at 13% are probably due to the high density of core
utilization and would probably not be seen if the total memory area were
very large.

Figures 4.2, 4.4 and 4.5 indicate that the ratio $S/C_T$ (which is
an index of the degree of fragmentation), the mean size of the total free
core ($\bar{W}$) and the average time per allocation (t) are more or less independent
of the degree of perturbation.

Figure 4.3 indicates that the least amount of total free core
decreases monotonically with the degree of perturbation. This is
expected because the average size of the allocated core spaces
increases as the degree of perturbation increases. The approximately linear
drop is explained by the fact that the random number generator is based
on a uniform distribution.

Figure 4.4

Figure 4.6 shows the least size of biggest free core (S) for the
best-fit algorithm. The observation is that even though the value of S is
generally lower in the best-fit than in the first-fit for the same degree
of perturbation, the perturbability of the best-fit algorithm is slightly
better (19% to 18%) than that of the first-fit algorithm. A complete
comparison of the two algorithms is presented in Table 4.1.


Table 4.1 Comparison of Best-Fit Algorithm
with the First-Fit Algorithm

                                     First-Fit    Best-Fit

Least size of biggest free (S)          3682        2626
Total core at that point (C_T)          4586        4250
Least total of free core                3952        3952
Mean size of biggest (W)               19386       19416
Critically tolerant size (T)           38500       39500
Perturbability (P)                       18%         19%
Average time per alloc. (t)            4.101       6.222

(Except where otherwise stated, the values cited are
those obtained for 0% degree of perturbation.)

Figure 4.6

A remark is in order here about the least total of free core.
This figure is independent of the algorithm if all algorithms start out
with the same amount of free core. It is dependent only on the degree
of perturbation, in the manner shown in Figure 4.3. However, in
algorithms where space is preallocated for specific purposes, the amount
of space preallocated will also govern this figure, as will be seen in
Chapter 5.

It  can  be  observed  that,    in  general,    the   first-fit  algorithm 
seems  to  be  more   suitable   for  the   set  of  requests  under  consideration. 

4.4 Cycle Algorithm

This name is being given to an algorithm first suggested by
Knuth [4]. It is assumed again that the ordering of the free blocks is
by $A(F_i)$.

A pointer $c_i$ is associated with every memory configuration $M_i$.
The algorithm does the following. For an allocation request:

1.  it finds $F_x$, $1 \le x \le N_f$, such that

    i)  $\ell(F_x) \ge a$, and

    ii) $\ell(F_j) < a \ \forall j$ such that $c_i \le j < x$, if $x \ge c_i$, or
        $\ell(F_j) < a \ \forall j$ such that ($1 \le j < x$ or
        $c_i \le j \le N_f$), if $x < c_i$.

2.  it sets $c_{i+1} \leftarrow c_i$ if $\ell(F_x) > a$, and
    $c_{i+1} \leftarrow c_i + 1$ if $\ell(F_x) = a$ and $c_i \ne N_f$, or
    $c_{i+1} \leftarrow 1$ if $\ell(F_x) = a$ and $c_i = N_f$.

Evidently if $c_i = N_f = 1$, the memory has no free spaces left and
hence $c_{i+1}$ is not defined. But this situation almost never occurs
in practice.

For a deallocation request, $c_i$ keeps pointing to the same free block. Hence
if $U_x$ is the block to be deallocated, then

    $c_{i+1} \leftarrow y$ such that $A(F_y) \le A(F_{c_i}) < A(F_y) + \ell(F_y)$

where $F_y$ is a free block of the configuration after the deallocation.

This  takes  into  account  the  possibility  that  coalescing  may  occur  on  either 
side  of  the  freed  block. 

A  PASCAL  implementation  of  this  algorithm  is  shown  below. 


VAR
   CYCLEPTR: ^NODE;   "GLOBALLY DECLARED POINTER"

PROCEDURE FINDCYCLEBLOCK(A: INTEGER; VAR P: ^NODE);
VAR
   PTR, FOLLOWPTR: ^NODE;
BEGIN
   P:=NIL;
   PTR:=FOLLOWPTR:=CYCLEPTR;
   WHILE (FOLLOWPTR^.LPTR\=CYCLEPTR)&(PTR^.LENGTH<A) DO
      PTR:=(FOLLOWPTR:=PTR)^.LPTR;
   IF PTR^.LENGTH>=A THEN P:=PTR;   "ALSO COVERS A FIT AT CYCLEPTR ITSELF"

   "THE SEARCH IS FACILITATED BY USING A CIRCULARLY"
   "LINKED LIST FOR THE FREE NODES WITH THE LENGTH"
   "FIELD<=0 FOR THE HEAD NODE"
   "CYCLEPTR NOW NEEDS TO BE RESET"

   IF P\=NIL THEN                   "NOTHING TO RESET IF THE SEARCH FAILED"
      IF (P^.LENGTH=A)&(P^.LPTR=FREE) THEN CYCLEPTR:=FREE^.LPTR
      ELSE IF P^.LENGTH=A THEN CYCLEPTR:=P^.LPTR
      ELSE CYCLEPTR:=P;

   "NOTE THAT OUTSIDE THIS PROCEDURE, THE ADDRESS AND"
   "LENGTH FIELD OF P SHOULD BE CHANGED. ALSO,"
   "RELINKING WILL BE NECESSARY WHEN THE SIZE OF THE"
   "FREE BLOCK FOUND IS EXACTLY EQUAL TO A"

END;   "FINDCYCLEBLOCK"

Similarly, the deallocating routine would be modified to:

PROCEDURE CYCLEDEALLOCATE(I: INTEGER "BLOCK IDENTIFYING INFORMATION");
VAR
   P: ^NODE;   "A SCRATCH POINTER"
BEGIN
   FIND(I,P);               "P POINTS TO THE BLOCK ASSOCIATED WITH"
                            "INFORMATION I"
   COALESCEIFPOSSIBLE(P);   "P WILL RETURN A POINTER TO THE LOW"
                            "ADDRESS OF THE FREED BLOCK AFTER"
                            "COALESCING WITH ADJACENT BLOCKS,"
                            "IF POSSIBLE"
   IF (CYCLEPTR^.ADDRESS>=P^.ADDRESS)&
      (CYCLEPTR^.ADDRESS<P^.ADDRESS+P^.LENGTH) THEN
      CYCLEPTR:=P;          "THIS TAKES CARE OF THE CASE"
                            "THAT CYCLEPTR MAY BE POINTING"
                            "TO A COALESCED BLOCK"
   INSERTINTOFREELIST(P);   "ALL COALESCED NODES ARE SIMULTANEOUSLY"
                            "DELETED FROM THE FREE LIST"
END;   "CYCLEDEALLOCATE"

It is clear that since the "cycleptr" ($c_i$) will point to some
free block in memory, not necessarily the first, the chances of distributing
the allocations more evenly around the memory space are greater. Besides
avoiding lop-sided fragmentation, this technique can save a lot of
searching time during allocation.
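
The behavior of the cycle pointer during allocation can be illustrated with a short Python sketch. It is not the PASCAL routine above. It assumes an address-ordered Python list of (address, length) pairs with the pointer held as a list index, and the names find_cycle_block and allocate_cycle are hypothetical. As in FINDCYCLEBLOCK, the pointer stays on the block found when a remainder is left, and moves to the successor (wrapping around) on an exact fit.

```python
def find_cycle_block(free, start, a):
    """Search the free list circularly from index start; return an index or None."""
    n = len(free)
    for step in range(n):
        i = (start + step) % n
        if free[i][1] >= a:
            return i
    return None


def allocate_cycle(free, ptr, a):
    """Allocate a units. Return (address or None, new pointer index)."""
    i = find_cycle_block(free, ptr, a)
    if i is None:
        return None, ptr
    address, length = free[i]
    if length == a:
        del free[i]                           # exact fit: block leaves the list
        ptr = i % len(free) if free else 0    # successor, wrapping at the end
    else:
        free[i] = (address + a, length - a)   # allocate from the front
        ptr = i
    return address, ptr


free = [(0, 100), (300, 50), (500, 400)]
ptr = 0
for size in (80, 50, 10):
    address, ptr = allocate_cycle(free, ptr, size)
    print(size, address, ptr, free)
```

In this run the request of size 10 is taken from the large block at the pointer, although the remnant at address 80 could have held it. First-fit would have chosen that remnant, which is the difference in distribution referred to above.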

4.5 Rover Algorithm

A new algorithm has been devised based on the cycle algorithm
just described. As before, a pointer is associated with every memory
configuration, $M_i$. Let it be called $r_i$. The rover algorithm does the
following. For an allocation request:

1.  It finds $F_x$, $1 \le x \le N_f$, such that

    i)  $\ell(F_x) \ge a$, and

    ii) $\ell(F_j) < a \ \forall j$ such that $r_i \le j < x$, if $x \ge r_i$, or
        $\ell(F_j) < a \ \forall j$ such that ($1 \le j < x$ or
        $r_i \le j \le N_f$), if $x < r_i$.

2.  It sets $r_{i+1} \leftarrow r_i$ if $\ell(F_x) > a$, and
    $r_{i+1} \leftarrow r_i + 1$ if $\ell(F_x) = a$ and $r_i \ne N_f$, or
    $r_{i+1} \leftarrow 1$ if $\ell(F_x) = a$ and $r_i = N_f$.

Thus this portion is similar to the corresponding portion for the cycle
algorithm.

During deallocation, however, the pointer is always reset to
the point where the deallocation took place. Again, if $U_x$ is the block
of active core which has just been deallocated,

    $r_{i+1} \leftarrow y$ such that $A(F_y) \le A(U_x) < A(F_y) + \ell(F_y)$

This algorithm would be written in PASCAL as:

VAR
   ROVERPTR: ^NODE;

PROCEDURE FINDROVERBLOCK(A: INTEGER; VAR P: ^NODE);
"THIS PROCEDURE IS THE SAME AS FOR FINDCYCLEBLOCK EXCEPT FOR THE"
"USE OF ROVERPTR INSTEAD OF CYCLEPTR"
END;   "FINDROVERBLOCK"

PROCEDURE ROVERDEALLOCATE(I: INTEGER);
VAR
   P: ^NODE;
BEGIN
   FIND(I,P);
   COALESCEIFPOSSIBLE(P);   "P HAS THE SAME SIGNIFICANCE"
                            "MENTIONED BEFORE"
   ROVERPTR:=P;             "NOTE THE DIFFERENCE HERE"
   INSERTINTOFREELIST(P);
END;   "ROVERDEALLOCATE"
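
The two deallocation rules differ only in what happens to the pointer, which the following Python sketch tries to make concrete. It assumes an address-ordered list of (address, length) pairs and an index-valued pointer. The helper deallocate stands in for COALESCEIFPOSSIBLE followed by INSERTINTOFREELIST, and all names are illustrative rather than taken from the system under study.

```python
import bisect


def deallocate(free, address, length):
    """Insert the freed block, coalescing with neighbors; return its index."""
    i = bisect.bisect_left(free, (address, 0))
    # Merge with the predecessor if it ends where the freed block starts.
    if i > 0 and free[i - 1][0] + free[i - 1][1] == address:
        i -= 1
        address, length = free[i][0], free[i][1] + length
        del free[i]
    # Merge with the successor if it starts where the freed block ends.
    if i < len(free) and address + length == free[i][0]:
        length += free[i][1]
        del free[i]
    free.insert(i, (address, length))
    return i


def rover_deallocate(free, ptr, address, length):
    """Rover rule: the pointer always moves to the freed (coalesced) block."""
    return deallocate(free, address, length)


def cycle_deallocate(free, ptr, address, length):
    """Cycle rule: keep the same block unless coalescing absorbed it."""
    ptr_address = free[ptr][0]
    j = deallocate(free, address, length)
    low, size = free[j]
    if low <= ptr_address < low + size:
        return j
    return next(k for k, b in enumerate(free) if b[0] == ptr_address)


free_a = [(0, 100), (300, 50), (500, 400)]
free_b = list(free_a)
print(cycle_deallocate(free_a, 2, 150, 50), free_a)
print(rover_deallocate(free_b, 2, 150, 50), free_b)
```

With the pointer initially on the block at address 500, the cycle rule leaves it on that block (whose index shifts by one), while the rover rule moves it to the newly freed block at address 150.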

4.6 Comparison of Performances (Cycle with Rover)

In the case of the rover algorithm, the pointer "roves" back and
forth always pointing to the latest point of allocation or deallocation.
In the cycle algorithm, the pointer keeps moving forward and around the
circularly linked list of free nodes in a "cyclic" manner. The degree of
freedom allowed is hence greater for the pointer in the rover algorithm
than in the cycle algorithm. Further, an allocation request is often
encountered whose size is the same as the size of the block just freed.
These two points help in reducing the degree of fragmentation for the
rover algorithm in comparison to the cycle algorithm. Results from
simulation are presented in Table 4.2. Simulation with several data
samples shows that the set of input data samples that were successfully
allocated by the cycle algorithm forms a subset of the set for the new
rover algorithm.


Table 4.2 Comparison of Cycle Algorithm with
Rover Algorithm

                                       Cycle       Rover

Least size of biggest free (S)           -           -
Total core at that point (C_T)           -           -
Mean size of biggest (W)               16978       18832
Critically tolerant size (T)           45700       43600
Perturbability (P)                        0%          0%
Average time per alloc. (t)            1.051       1.084


An increase in the average time per allocation (about 3%) is
noticed for the rover algorithm in comparison to the cycle algorithm. This
difference can be explained by the fact that until some significantly
large request for allocation comes along, the "cycle pointer" keeps pointing
to the front of an unventured area of memory. Thus for a greater number
of requests than in the rover algorithm, the required block will be
obtained after the very first comparison.
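
The quoted increase follows directly from the two timing entries of Table 4.2, as this small Python check shows (the variable names are illustrative):

```python
t_cycle = 1.051  # average time per allocation, cycle algorithm (Table 4.2)
t_rover = 1.084  # average time per allocation, rover algorithm (Table 4.2)

increase = (t_rover - t_cycle) / t_cycle
print(f"relative increase: {increase:.1%}")
```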

The perturbability is seen to be zero for both algorithms. This
means that the algorithms were unable to satisfy the standard set of
requests obtained from the PASCAL compiler. Both S and $C_T$ are hence
meaningless when a request set is not satisfied by an algorithm.

Thus  there  are  situations,  like  the  PASCAL  system  under 
investigation,  where  neither  of  the  modifications  (cycle  or  rover)  is 
suitable  because  of  the  particular  nature  of  the  requests.  An  example 
will  now  be  cited  to  aid  in  the  analysis  of  this  failure. 

Consider the memory configuration

    $M_i = [L, N_f, N_u, F, U]$ with

    $F = \{(a+b,\ c),\ (a+b+c+d,\ L-a-b-c-d)\}$ and

    $U = \{(0,\ a),\ (a,\ b),\ (a+b+c,\ d)\}$, where $L - (a+b+c+d) > a + c$.

This is pictured in Figure 4.7.

Figure 4.7 Configuration After Request i
(address marks at 0, a, a+b, a+b+c and a+b+c+d; legend: used, free)

Now suppose the requests following are:

Request i+1: Deallocate - from location 0
Request i+2: Allocate - size 'e', e > a, e > c
Request i+3: Allocate - size 'f', f < a
Request i+4: Allocate - size 'g', g < c
Request i+5: Allocate - size 'h',
             L - (a+b+c+d+e) > h > L - (a+b+c+d+e+f+g)

Irrespective of the initial position of the roving pointer or the
cyclic pointer, it is clear that this set of requests will not be satisfied
by the rover or cycle algorithms but will be satisfied by both the best-fit
as well as the first-fit algorithms.
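
The example can be replayed numerically. The Python sketch below is an illustration under assumed values, not the simulation program used in this study. It takes L = 1000 and a = b = c = d = 100, so that the free blocks after request i+1 are (0, 100), (200, 100) and (400, 600), and uses e = 150, f = g = 60 and h = 400, which satisfy the stated inequalities. The function names are hypothetical, and exhausted blocks are simply left in the list with length zero since no exact fit occurs here.

```python
def alloc(free, start, a, best=False):
    """Pick a block for a request of size a. Return (address, index) or (None, start)."""
    n = len(free)
    order = [(start + s) % n for s in range(n)]       # circular search order
    fits = [i for i in order if free[i][1] >= a]
    if not fits:
        return None, start
    i = min(fits, key=lambda k: free[k][1]) if best else fits[0]
    address, length = free[i]
    free[i] = (address + a, length - a)               # allocate from the front
    return address, i


def run(policy, requests=(150, 60, 60, 400)):
    """Replay requests i+2 .. i+5; return True if all are satisfied."""
    free = [(0, 100), (200, 100), (400, 600)]         # state after request i+1
    ptr = 0                                           # rover sits on the freed block
    for size in requests:
        start = ptr if policy == "rover" else 0       # first/best always start at 0
        address, index = alloc(free, start, size, best=(policy == "best"))
        if address is None:
            return False
        ptr = index
    return True


for policy in ("first", "best", "rover"):
    print(policy, run(policy))
```

Under the rover policy the three smaller requests are all carved from the large block at the end, so the final request of size 400 finds only 330 units there and the two holes of 100 units are of no help.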

It appears then that these modified algorithms are not suitable
when a large request for allocation follows a number of small requests,
with no deallocations from a region behind the pointer and towards the
front end of memory. The simulation study lends credence to this surmise.
A comparison of the four algorithms with different memory sizes is
provided in Figures 4.8 through 4.10.

4.7 Combination of Best-Fit and Rover Algorithms

It has just been mentioned that failure in the cycle and rover
modifications to the first-fit algorithm occurs usually when a large request
for allocation follows a number of small requests (with certain restrictions
on the region of deallocation). The failure in the actual simulation
occurred for a request size of 21K bytes, which is seen to be comparable
to the total memory size, L, of 42K bytes.

Figure 4.8

Figure 4.9

Figure 4.10

This observation has led the author to discover an allocation
algorithm, the principle behind which is expected to have tremendous use
not only to PASCAL-like systems but to dynamic allocation algorithms in
general. The algorithm uses a technique which switches dynamically between
algorithms to obtain better performance.

When  the  largest  area  of  free  memory  reduces  to  a  level 
comparable  with  that  of  the  bigger  allocation  requests,  it  is  necessary 
to  make  more  efficient  utilization  of  the  free  core.   Failure  in  the 
cycle  and  rover  algorithms  was  brought  about  by  the  inability  of  the 
algorithms  to  detect  such  situations.   At  the  same  time,  the  best-fit 
algorithm  essentially  detects  such  a  situation  on  every  occasion.   (This 
over-meticulousness  of  the  best-fit  algorithm  results  in  the  production 
of  a  conglomeration  of  tiny  inactive  pieces  which  increases  the  degree 
of fragmentation.) It then seems appropriate to use the rover algorithm
at  all  times  except  when  the  value  of  the  total  free  memory,  F  ,  (or  almost 
equivalently,  the  size  of  the  biggest  free  block)  exceeds  a  certain  cutoff 
value,  C,  at  which  time  better  utilization  of  the  available  space  can  be 
made  using  the  best-fit  algorithm. 

The algorithm will thus consist of the following basic steps:

A1:  Check if the next request is one for allocation or for deallocation.
     In the latter case go to A2, in the former case go to A3.
A2:  Perform the deallocation by the standard technique, coalescing
     adjacent free blocks if possible, then reset the "roverptr" to
     the point of deallocation, and update the value of total free core
     available. Go to A1.
A3:  If the total free memory space (F) is less than the cutoff value
     (C) employ the rover algorithm to look for the required space,
     else employ the best-fit algorithm. In either case, go to A4 if
     the search is successful and A5 if not.
A4:  Reset the "roverptr" appropriately to the point of allocation,
     update the total available memory space (F) and go back to A1
     for the next request.
A5:  Indicate that no core is available, and terminate. (In
     multiprogramming, continue with the other jobs, if there are
     any.)
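
Step A3 is the heart of the scheme, and a compact Python sketch of it may be helpful. It follows the rule as stated in the steps above (rover when the total free space is below the cutoff, best-fit otherwise). It assumes an address-ordered list of (address, length) pairs with an index-valued rover pointer, and recomputes the total free space on each call for brevity instead of maintaining a running total. The name find_block and the sample figures are illustrative.

```python
def find_block(free, rover, a, cutoff):
    """Return the index of the block chosen for a request of size a, or None."""
    total_free = sum(length for _address, length in free)
    n = len(free)
    if total_free > cutoff:
        # Best-fit branch: the smallest block that is still large enough.
        fits = [i for i in range(n) if free[i][1] >= a]
        return min(fits, key=lambda i: free[i][1]) if fits else None
    # Rover branch: first fit in circular order, starting at the rover pointer.
    for step in range(n):
        i = (rover + step) % n
        if free[i][1] >= a:
            return i
    return None


free = [(0, 300), (400, 120), (600, 500)]   # 920 units free in total
print(find_block(free, rover=2, a=100, cutoff=0))      # best-fit branch
print(find_block(free, rover=2, a=100, cutoff=1000))   # rover branch
```

A cutoff of 0 makes the total free space always exceed it, giving pure best-fit, and a cutoff at or above the memory size gives pure rover behavior. This matches the way the extreme cutoff values are described for the simulation.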

A possible PASCAL implementation for this algorithm is shown
below:

PROCEDURE FINDBLOCK(A: INTEGER; VAR P: ^NODE);

   PROCEDURE FINDBESTBLOCK(A: INTEGER; VAR P: ^NODE);
   VAR
      PTR, FOLLOWPTR: ^NODE;
      MINDIFF, DIFF: INTEGER;
   BEGIN
      MINDIFF:=SOMELARGENUMBER;
      P:=NIL;
      PTR:=(FOLLOWPTR:=FREE)^.LPTR;   "START AFTER THE HEAD NODE"
      WHILE (PTR\=FREE)&(MINDIFF\=0) DO BEGIN
         DIFF:=PTR^.LENGTH-A;
         IF (DIFF>=0)&(DIFF<MINDIFF) THEN BEGIN
            MINDIFF:=DIFF;
            P:=PTR;
         END;   "IF DIFF>=0..."
         PTR:=(FOLLOWPTR:=PTR)^.LPTR;
      END;   "WHILE"
   END;   "FINDBESTBLOCK"

   PROCEDURE FINDROVERBLOCK(A: INTEGER; VAR P: ^NODE);
   VAR
      PTR, FOLLOWPTR: ^NODE;
   BEGIN
      P:=NIL;
      PTR:=FOLLOWPTR:=ROVERPTR;
      WHILE (FOLLOWPTR^.LPTR\=ROVERPTR)&(PTR^.LENGTH<A) DO
         PTR:=(FOLLOWPTR:=PTR)^.LPTR;
      IF PTR^.LENGTH>=A THEN P:=PTR;   "ALSO COVERS A FIT AT ROVERPTR ITSELF"
   END;   "FINDROVERBLOCK"

BEGIN
   IF TOTALAVAILABLESPACE>CUTOFFVALUE
      THEN FINDBESTBLOCK(A,P)
      ELSE FINDROVERBLOCK(A,P);
   IF (P\=NIL) THEN
      IF (P^.LENGTH=A)&(P^.LPTR=FREE) THEN
         ROVERPTR:=FREE^.LPTR
      ELSE IF P^.LENGTH=A THEN ROVERPTR:=P^.LPTR
      ELSE ROVERPTR:=P;
END;   "FINDBLOCK"


PROCEDURE ALLOCATE(A: INTEGER);
VAR
   P, Q: ^NODE;
BEGIN
   FINDBLOCK(A,P);
   IF P=NIL THEN SYSTEMHASRUNOUTOFCORE
   ELSE BEGIN
      NEW(Q);   "CREATES NEW NODE"
      Q^.LENGTH:=A;
      Q^.INFO:=APPROPRIATEINFORMATION;
      Q^.ADDRESS:=P^.ADDRESS;
      IF P^.LENGTH\=A THEN BEGIN
         P^.ADDRESS:=P^.ADDRESS+A;
         P^.LENGTH:=P^.LENGTH-A;
      END   "IF P^.LENGTH\=A"
      ELSE REMOVE(P,FREE);   "REMOVE FROM FREE LIST"
      MOVE(Q,USED);          "INSERT Q AT THE PROPER PLACE"
                             "IN THE USED CHAIN"
      TOTALAVAILABLESPACE:=TOTALAVAILABLESPACE-A;
   END;   "ELSE BEGIN"
END;   "ALLOCATE"


PROCEDURE DEALLOCATE(INFO: INTEGER);
VAR
   P, Q: ^NODE;
BEGIN
   FIND(INFO,Q);   "Q IS THE NODE WITH THE REQUIRED INFO"
   TOTALAVAILABLESPACE:=TOTALAVAILABLESPACE+Q^.LENGTH;
   P:=Q;                    "ASSUMED: COALESCING STARTS FROM THE NODE FOUND"
   COALESCEIFPOSSIBLE(P);   "P RETURNS A POINTER TO THE"
                            "LOW ADDRESS OF THE FREED BLOCK"
                            "AFTER ANY POSSIBLE COALESCING"
   ROVERPTR:=P;
   INSERTINTOFREELIST(P);
END;   "DEALLOCATE"

It seems reasonable to expect some of the characteristics of
each of the algorithms when two algorithms are combined in the indicated
manner. The degree of dominance of one algorithm over the other is
determined by the cutoff value. Thus the sets of requests satisfied by
this algorithm should include some, if not all, of those sets satisfied
by one (best-fit, in this case) but not by the other (rover). Further,
the average time taken per allocation is expected to be intermediate to
the time taken by the fast rover algorithm and that taken by the slow
best-fit algorithm.

The above conjectures were verified by actual simulation of the
algorithm. Various cutoff values were used, a value of 0 corresponding
to the absolute best-fit algorithm and a value of 42100 corresponding to
the rover algorithm. The results are summarized in Table 4.3 and
graphically presented in Figures 4.11 through 4.14.

It  is  noticed  that  while  the  combination  manages  to  salvage  the 
good  points  of  both  algorithms,  it  inherits  some  of  the  bad  traits  too. 
Thus  a  deterioration  is  noticed  in  the  perturbability  (about  2$%   for  a 
cutoff  of  25000)  from  the  best-fit  case.   This  probably  is  offset  by  the 
fact  that  the  time  taken  per  allocation  has  reduced  in  the  corresponding 
case  by  about  60%.   (The  actual  time  taken  will  be  a  little  more  because 
of  the  extra  overhead  involved  in  maintaining  pointers  and  checking  the 
total amount of free core remaining.)

Figure 4.13

4.8 Other Combination Algorithms

Tables 4.4 and 4.5 indicate the results obtained by combining the
best-fit and cycle algorithms, and the first-fit and rover algorithms,
respectively.

Table 4.4 indicates that there is little difference between
using the cycle and rover algorithms in combination with the best-fit
algorithm. Use of the cycle algorithm seems to be beneficial where time
is concerned, while use of the rover algorithm seems beneficial where the
degree of fragmentation is concerned. The choice in most cases would fall
on the use of the rover algorithm in combination with the best-fit
algorithm.

Use of the first-fit algorithm, as seen from Table 4.5, is of
little benefit. The perturbability is higher and degree of fragmentation
is worse than with the use of the best-fit algorithm. This can be
explained by the fact that frugal utilization of space is important when
the total free memory is low, as we have seen earlier. This is achieved
by utilizing the best-fit algorithm, but not guaranteed while using the
first-fit algorithm in combination with the other algorithms.

5.   BOUNDARY ALGORITHMS

5.1  Introduction 

The class of algorithms termed boundary algorithms here involves
the subdivision of the initial free memory into a set of areas, each of
which may be used for the allocation of prespecified request sizes only.
Thus these algorithms attempt to minimize fragmentation by separating
those request sizes suspected to be the cause of problems, and by
assigning space for them right at the beginning.

Evidently,  this  class  of  algorithms  is  highly  system  dependent 
and  to  some  extent  dependent  on  the  request  set  also.   In  the  extreme  case, 
the  entire  set  of  requests  could  be  scanned  manually  and  the  allocations 
could  be  hand  manipulated  to  achieve  minimum  fragmentation.   Such  a 
solution,  besides  being  impractical,  cannot  even  be  approached  for  request 
sets  containing  thousands  of  requests. 

The  concept  of  subdividing  the  memory,  however,  leads  to  some 
interesting  algorithms,  as  shown  below. 

5.2  Subdivision  by  Size  Range 

It has been seen earlier that the presence of a few very large
allocation requests in the midst of a conglomeration of small requests,
under certain conditions, leads to extensive fragmentation. An apparent
solution to this problem would be to allocate requests having sizes within
a certain range into one area, within another range into another area, and
so on. This immediately leads to an uncertainty regarding the choice of
the ranges and regarding the amount of core to be reserved for each of the
ranges. Further, the allocation policy must decide which area should be
searched, should there be no space in the area determined by an incoming
request.

An elegant solution exists when the number of preallocated areas
is 2. It involves the use of the first-fit algorithm with the search
being initiated from one end of the memory if the request size is less than
a certain specified threshold and from the other if it is greater than the
threshold. In both cases, allocation is from the front of the area found.
This technique, unfortunately, does not reduce the degree of fragmentation
much if

     i)  the number of different request sizes on either side
         of the threshold is approximately equal, and
    ii)  there are many requests of different sizes, on the
         higher side of the threshold, which coexist (i.e., at
         the same time) in memory.

(The problem arises because of the small "holes" existing between large
active areas. These holes, created by a slight difference in the size of
the large requests, may be smaller than the threshold and yet may never
be utilized by requests smaller than the threshold.)
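
A minimal Python sketch of the two-ended search is given here for illustration. It assumes an address-ordered list of (address, length) pairs, and treats a request exactly equal to the threshold as a large one, a case which the description above leaves open. The name allocate_two_ended and the sample values are hypothetical.

```python
def allocate_two_ended(free, a, threshold):
    """First-fit from the low end for small requests, from the high end for large ones.

    Allocation is from the front of the block found in both cases.
    Returns the allocated address, or None if no block is large enough.
    """
    if a < threshold:
        indices = range(len(free))                   # low addresses first
    else:
        indices = range(len(free) - 1, -1, -1)       # high addresses first
    for i in indices:
        address, length = free[i]
        if length >= a:
            if length == a:
                del free[i]                          # exact fit
            else:
                free[i] = (address + a, length - a)
            return address
    return None


free = [(0, 200), (400, 300), (900, 600)]
for size in (100, 300, 300):
    print(size, allocate_two_ended(free, size, threshold=250), free)
```

Returning immediately after the deletion keeps the loop safe even though the list is modified while it is being scanned.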

The PASCAL compiler, as seen from Figure 3.3, generates a set of
requests a very large percentage of which are of sizes less than 5000 bytes.
Requests of very large size correspond to the various passes of the
compiler, each of which is unloaded from core by the time the next comes in.
This overlay structure thus ensures that if the threshold were made slightly
smaller than the size of the smallest pass, then "holes" of the form
mentioned earlier would never occur.

The  result  of  implementing  such  an  algorithm  is  shown  in 
Figures  5.1  through  5.3. 

A careful review of the action of this algorithm indicates that
nearly similar results would be obtained if, on finding a suitably-sized
free space, the first-fit algorithm were made to allocate core towards
the end of that space for a request size greater than the threshold,
rather than the beginning of that space as it normally does. (For sizes
smaller than threshold the normal first-fit would be employed.) For the
specific example shown in Figures 5.1 through 5.3, the results from this
modification would be identical to that shown in Figure 5.3.
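
This modification changes only where inside the block found the request is placed. The following Python sketch is illustrative: it assumes an address-ordered list of (address, length) pairs, and the name allocate_threshold is hypothetical.

```python
def allocate_threshold(free, a, threshold):
    """First-fit search from the low end of memory.

    Requests larger than threshold are carved from the far (high) end of the
    block found; all others from its front. Returns the address or None.
    """
    for i, (address, length) in enumerate(free):
        if length >= a:
            if length == a:
                del free[i]                        # exact fit, nothing remains
                return address
            if a > threshold:
                free[i] = (address, length - a)    # low part stays free
                return address + length - a        # allocate at the high end
            free[i] = (address + a, length - a)    # normal first-fit placement
            return address
    return None


free = [(0, 1000)]
for size in (100, 600, 300):
    print(size, allocate_threshold(free, size, threshold=500), free)
```

Large requests therefore accumulate at the far end of a free space while small ones fill it from the front, which is the separation that the two-ended search was intended to achieve.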

Results from simulating such an algorithm with varying threshold
values are tabulated in Table 5.1 and depicted graphically in Figures 5.4
through 5.7.

An immediate revelation is that, with a threshold of 0, the
search time is very small. This can be explained by the fact that, at
least in the initial stages, the required size is found at the very first
attempt and the allocation is performed always on the far side of the
free space. A side effect of this is the continual erosion of the first
free space, which in the initial stages happens to be the biggest also.
After a stage there is no contiguous core left to allocate even medium sized
requests, let alone large sized requests. This results in the poor figures
for perturbability, etc. (An analogous situation would exist if allocation
were continually performed with no deallocations, but with tags indicating
whether a block is active or not, as in garbage collection routines.)

Figure 5.1 Memory Configuration on Arrival of
a Large-sized Request

Figure 5.2 Memory Configuration after Satisfying
Request by First-fit Algorithm

Figure 5.3 Memory Configuration after Satisfying
Request by New Threshold Algorithm

Figure 5.4

Figure 5.5

Figure 5.6

For  a  threshold  value  of  5000  and  above,  the  expected  results 
are  seen,  with  the  final  plateau  being  reached  at  30000,  after  which  the 
performance  is  similar  to  that  of  the  first-fit  algorithm.   The  performance 
in  the  initial  stages  is  akin  to  that  of  the  best-fit  algorithm,  but 
without  its  disadvantage  of  long  searches.   However,  it  may  be  safely 
concluded  that  given  the  choice  between  the  first-fit  algorithm  and  the 
modification  using  threshold,  the  former,  with  its  reduced  overhead  would 
be  a  better  bet. 

5.3  Preallocation  of  Buffers 

A second look at Figure 3.3 indicates that a large number of
allocation requests are made for the size 512. It has also been mentioned
earlier that these requests correspond to allocation of file buffer
space of 1000 octal bytes or 512 decimal bytes. If every request of
size 512 were allocated only after the previous request of size 512 had
been deallocated, it would be worthless to orient any algorithm to take
any special action on 512-byte requests.

By using the monitoring program (Section 3.3 (iii)) an estimate
was obtained of the maximum number of blocks of size 512 which exist in
core at any given time. This figure was found to be 14 for a typical
PASCAL compilation job and smaller for all noncompilation jobs.

It then appears that some advantage may be obtained by reserving
a fixed space for allocation requests of size 512. This space would have
to be of a size which is a multiple of 512, and should ideally be
sufficient to accommodate all 512-byte requests that can ever exist
simultaneously. Two problems exist with this approach:

     i)  The maximum number just described will vary from
         request set to request set and cannot be accurately
         predicted.
    ii)  Even if it were possible to get this number,
         utilization of this space would never be even
         close to 100%, thus preventing requests of other
         sizes from utilizing the unused space and
         inevitably leading to excessive fragmentation.

The main advantage that could be derived from such a system is a
reduction in the time taken for allocation, particularly if a 2-dimensional
array could be maintained which contained the addresses of all such 512-byte
reserved spaces and a flag indicating whether they are active or
inactive. (The overhead of core space required in the storage of this
array increases with the number of spaces reserved, but is generally
negligible.)

A better performance may be expected if an idea can be obtained
about the number of buffers of size 512 which are generally required most
of the time, instead of the maximum number which may be needed only for
a very short time. This could be obtained by determining the time-weighted
average of the number of 512-byte blocks needed simultaneously in
core. For a typical set of requests, this time-weighted average was found
to be close to 2. Thus a better performance may be expected if this
number of 512-byte spaces were preallocated. The allocation policy could
then be:

     i)  For all requests of size 512, look for an inactive
         512-byte preallocated space. If none is
         available go to the general area and look for a big
         enough space by the first-fit technique.
    ii)  For all other requests use the first-fit technique
         in the general area.
   iii)  For deallocation of a block of size 512, coalesce with
         neighboring blocks, as in standard deallocation, only
         if the block is not a preallocated one.
    iv)  For deallocation of a block of size 512 in the
         preallocated area, just switch the tag in the
         2-dimensional array to "inactive."
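
The reserved-space part of this policy (items i and iv) can be sketched in Python as shown here. The class BufferPool, its method names and the base address are assumptions made for illustration. The table of [address, active flag] rows plays the role of the 2-dimensional array, and the fallback to first-fit in the general area is left to the caller.

```python
class BufferPool:
    """Table of preallocated spaces, one [address, active flag] row per space."""

    def __init__(self, base, count, size=512):
        self.size = size
        self.slots = [[base + k * size, False] for k in range(count)]

    def allocate(self):
        """Return the address of an inactive reserved space, or None if all are active."""
        for slot in self.slots:
            if not slot[1]:
                slot[1] = True
                return slot[0]
        return None   # caller falls back to first-fit in the general area

    def release(self, address):
        """Mark a reserved space inactive; return False if address is not an active one."""
        for slot in self.slots:
            if slot[0] == address and slot[1]:
                slot[1] = False
                return True
        return False  # caller performs the standard deallocation with coalescing


pool = BufferPool(base=0, count=2)
print(pool.allocate(), pool.allocate(), pool.allocate())
print(pool.release(0), pool.allocate())
```

No searching of the free list and no coalescing is needed for a reserved space, which is where the expected saving in allocation time would come from.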
This algorithm was simulated for various numbers of reserved
blocks of size 512 and most of the surmises above were found true. The
results are tabulated in Table 5.2 and depicted in the graphs of
Figures 5.8 through 5.12.
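
The time-weighted average used to choose the number of reserved spaces can be computed from a trace of allocations and deallocations of 512-byte blocks. The Python sketch below shows one way of doing so. The event format (a time stamp with +1 for an allocation and -1 for a deallocation), the function name and the sample trace are all assumptions, since the format of the monitoring output is not specified here.

```python
def live_block_statistics(events):
    """Return (time-weighted average, peak) of the number of live blocks.

    events is a non-empty list of (time, delta) pairs, where delta is +1 for
    an allocation and -1 for a deallocation of a 512-byte block.
    """
    events = sorted(events)
    live, peak, area = 0, 0, 0.0
    previous_time = events[0][0]
    for time, delta in events:
        area += live * (time - previous_time)   # blocks held over the interval
        live += delta
        peak = max(peak, live)
        previous_time = time
    span = events[-1][0] - events[0][0]
    average = area / span if span else float(live)
    return average, peak


trace = [(0, +1), (2, +1), (3, +1), (4, -1), (4, -1), (10, -1)]
average, peak = live_block_statistics(trace)
print(average, peak)
```

In this made-up trace the peak is three blocks but the average is much lower, the same kind of gap as that between the maximum and the time-weighted average reported above.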

Figures 5.8 through 5.11 show a steady deterioration in the
perturbability characteristics as the number of preallocated spaces is
increased from 2 to 12. The reason for this, as mentioned before, is the
low utilization of the reserved space. Also, the average time taken per
allocation is seen to decrease gradually as the number increases, and at
6 reserved spaces the time taken per allocation and the least size of
the biggest block are both down by 33%.
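
As a check on these figures, the reductions can be recomputed from
Table 5.2. The sketch below assumes that S is the least size of the
biggest block and t the average time per allocation (the symbols are
defined elsewhere in the thesis), and it reads the entries for 0 and 6
preallocated spaces from the table. The helper percent_drop is a
hypothetical name introduced for the example.

```python
# Values read from Table 5.2 (columns for 0 and 6 preallocated spaces).
S = {0: 3682, 6: 2476}    # assumed: least size of the biggest block
t = {0: 4.101, 6: 2.768}  # assumed: average time taken per allocation


def percent_drop(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


drop_S = percent_drop(S[0], S[6])
drop_t = percent_drop(t[0], t[6])
print(f"S down {drop_S:.1f}%, t down {drop_t:.1f}%")
```

Under that reading of the table, both reductions come out close to 33%.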


Table 5.2  Variation of Performance with Number of
Preallocated Spaces of Size 512

# of preallocated
spaces of size 512           0         2         4         6         8        10        12

S                         3682      3508      3462      2476     1444?       588         -
W                        19386     14988     13352     11528     10728     10152         -
T                        38500     38600     38700     39700     40700     41700     42800
P                          18%       18%         ?       17%       12%        4%        0%
t                        4.101     3.968     3.381     2.768     2.508     2.393     2.005

Figure 5.8  First-Fit Algorithm with Preallocation of Spaces for
Requests of Size 512.  Horizontal axis: number of preallocated spaces
(each 512 bytes).

Figure 5.9  First-Fit Algorithm with Preallocation of Spaces for
Requests of Size 512.  Horizontal axis: number of preallocated spaces
(each 512 bytes).

Figure 5.10  First-Fit Algorithm with Preallocation of Spaces for
Requests of Size 512.  Horizontal axis: number of preallocated spaces
(each 512 bytes).

Figure 5.11  First-Fit Algorithm with Preallocation of Spaces for
Requests of Size 512.  Horizontal axis: number of preallocated spaces
(each 512 bytes).

Figure 5.12  Comparison of Algorithms: the first-fit algorithm and
first-fit with 2 preallocated spaces for requests of size 512.
Horizontal axis: percentage degree of perturbation.

More interesting, however, are the plots in Figure 5.12. The
potential of the algorithm is clearly seen. With 2 reserved spaces, the
algorithm starts out apparently inferior to the first-fit algorithm, but
it picks up and surpasses the first-fit algorithm as the degree of
perturbation increases and as the core situation gets tighter.

5.4  Preallocation of 32-byte Sized Spaces

The small degree of success obtained by use of the algorithm just
described leads to the notion that it is quite worthwhile to treat very
frequently occurring sizes as special cases. A glance again at Figure 3.3
shows that the size 32 is a good candidate for experimentation.
Monitoring of requests of size 32 indicated that the maximum number of
allocated blocks of that size at any given time is 10. It does not seem
necessary to determine a time-weighted average in this case because the
total size of preallocated core is only 320 bytes.
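
The two monitored quantities discussed here, the maximum number of
simultaneously allocated blocks of a given size and its time-weighted
average, could be obtained from an allocation trace along the following
lines. The trace format (a list of (time, +1 or -1) events for one block
size) and the function name occupancy_stats are assumptions made for
illustration; the monitoring statements actually used in the simulation
are not reproduced here.

```python
def occupancy_stats(events, end_time):
    """Peak and time-weighted average of the number of live blocks.

    events: (time, delta) pairs for one block size, with delta = +1 on
    allocation and -1 on release. end_time must lie after the first event.
    """
    events = sorted(events)
    start = events[0][0]
    live = 0      # blocks currently allocated
    peak = 0      # largest value of `live` seen so far
    area = 0.0    # integral of `live` over time
    prev = start
    for time, delta in events:
        area += live * (time - prev)  # count held since the previous event
        prev = time
        live += delta
        peak = max(peak, live)
    area += live * (end_time - prev)  # tail interval up to the end of run
    return peak, area / (end_time - start)
```

A peak that is held only briefly raises the maximum without moving the
time-weighted average much, which is the distinction drawn earlier for
the 512-byte buffers.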

The first-fit algorithm was then simulated after reserving 10 spaces
for 32-byte requests and a varying number of spaces for 512-byte
requests. The results are tabulated in Table 5.3.

Comparing these figures with those shown in Table 5.2, it is seen
that the degree of fragmentation is reduced almost imperceptibly by
preallocating spaces for size 32. The reduction in time taken per
allocation is, however, immediately obvious.

To show that preallocation of a large number of spaces of small
size does not lead to excessive fragmentation (unlike the preallocation
of large spaces), the simulation was repeated for 30 reserved spaces of
size 32. (It may be remembered that 10 is the maximum number of blocks
of size 32 required by most programs.)

Table 5.3  Performance for 10 Reserved Spaces of Size 32

Table 5.4  Performance for 30 Reserved Spaces of Size 32

The negligible change in the time taken per allocation arises
because most (probably all) of the 32-byte requests are satisfied
from the preallocated area with only 10 spaces. The extra 640 bytes in
this area are never used. It can be seen from the table and from the
graphs in Figures 5.13 through 5.16 that, in spite of the tripling of
the reserved space, the increase in the degree of fragmentation is not
too significant.

Figure 5.13

6.   CONCLUSION

It is surprising that the first-fit algorithm, which appears at
first glance to be "just another algorithm," without any apparent
structure or organization, is one of the most robust of all allocation
algorithms. Other algorithms which perform nearly as well as, or a
little better than, the first-fit algorithm are

i)  a combination of the best-fit and rover algorithms, and
ii)  modified first-fit using preallocated spaces for
request sizes 512 and 32.

Many  other  algorithms  described  in  this  thesis  cut  down 
tremendously  on  the  time  taken  per  allocation,  but  only  at  the  expense  of 
increased  fragmentation  of  memory.   If  memory  size  is  not  a  limitation, 
the  cycle  or  the  rover  algorithm  may  be  employed,  the  latter  having  an 
edge  over  the  former. 

For any system, modifications of the commonly used algorithms may
be found which perform better than the common algorithms themselves. It
is emphasized, though, that the specific modifications are dependent on
the system or the type of system.

Based on the results presented in this thesis, it is suggested
that, in spite of the fact that a few better algorithms exist, the
existing first-fit algorithm be retained for the PASCAL system. It
appears that the request sets not satisfied by the first-fit algorithm,
but satisfied by some of the other algorithms mentioned, are so few that
they do not warrant a change in the allocation strategy. Further, the
first-fit algorithm possesses the advantage of unbelievable simplicity
(which incidentally leads to smaller core requirements for the
allocation routine).

Only half the importance of this thesis is due to the results
just mentioned. The other half should go to the tools utilized in
arriving at these results: specifically, the simulation techniques, the
monitoring techniques and the performance criteria. Most of the
performance criteria were found to be useful indicators of the
suitability of an algorithm. Outstanding among those used to measure the
degree of fragmentation were

i)  the  least  size  of  the  biggest  area  of  free  core, 
ii)  the  mean  size  of  the  biggest  free  block,  and 
iii)  the  critically  tolerant  size. 
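
Criteria (i) and (ii) can be computed from a record of the free list
taken during a run, as in the sketch below. The snapshot format (one
list of free-block sizes per sampling instant), the unweighted mean over
snapshots, and the function name biggest_block_stats are assumptions for
illustration. Criterion (iii) is left out because its definition is
given elsewhere in the thesis.

```python
def biggest_block_stats(snapshots):
    """Least and mean size of the biggest free block over a run.

    snapshots: one list of free-block sizes per sampling instant; an
    empty list (no free core at all) counts as a biggest block of 0.
    """
    biggest = [max(sizes, default=0) for sizes in snapshots]
    least = min(biggest)                 # criterion (i)
    mean = sum(biggest) / len(biggest)   # criterion (ii)
    return least, mean
```

A low value of the least size flags an instant at which even a moderate
request would have failed, while the mean describes the typical state of
core over the run.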

It is suggested that the new algorithms mentioned in this thesis
be tried out in other, non-PASCAL-like environments.